Fix for ccm and cmx out of sync #171

Merged
merged 3 commits into from
Jul 10, 2020

Conversation

spficklin
Member

@spficklin spficklin commented Jul 10, 2020

This PR fixes issue #157. The problem can occur with either the corrpower or the cond-test analytic, but only when the matrix is sparse. If the cmx or ccm file is sparse, processing of pairs does not start at the first pair. This does not create incorrect results; it offsets the counting in the for loops, which leaves the ccm with fewer pairs than the original cmx. The unfortunate side effect, other than the warning message, is that a few potential edges may be missing from the output files after an extract.

This code fixes that problem by adding a new function, Matrix::Pair::readFirst(), which lets each analytic find the first real pair in a sparse matrix before beginning the first work block. I also adjusted the code so that, if pairs are ever missing from one file in the future, moving between pairs in the files can recover from the out-of-sync condition.
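The idea described above can be sketched roughly as follows. This is a minimal illustration, not KINC's actual implementation: `PairIndex`, `SparseCursor`, and `matchedPairs` are hypothetical names, and the sketch assumes that each sparse file stores its pairs in ascending order. Only the name `readFirst` is taken from the PR description. Each cursor is first positioned on its first stored pair, and whenever the two cursors disagree, the one that is behind is advanced until they line up again.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pair coordinate (x, y) in a sparse matrix.
struct PairIndex {
    int x;
    int y;
    bool operator==(const PairIndex& o) const { return x == o.x && y == o.y; }
    bool operator<(const PairIndex& o) const {
        return x != o.x ? x < o.x : y < o.y;
    }
};

// Hypothetical cursor over the pairs actually stored in a sparse file.
// Assumption: stored pairs are sorted in ascending order.
class SparseCursor {
public:
    explicit SparseCursor(std::vector<PairIndex> stored)
        : _stored(std::move(stored)) {
        assert(std::is_sorted(_stored.begin(), _stored.end()));
    }

    // Move to the first stored pair; returns false if the file is empty.
    bool readFirst() { _pos = 0; return hasPair(); }

    // Move to the next stored pair; returns false once past the end.
    bool readNext() { if (hasPair()) ++_pos; return hasPair(); }

    bool hasPair() const { return _pos < _stored.size(); }
    const PairIndex& index() const { return _stored[_pos]; }

private:
    std::vector<PairIndex> _stored;
    std::size_t _pos = 0;
};

// Walk both files together and collect the pairs present in both.
// `skipped` counts pairs passed over while resynchronizing.
std::vector<PairIndex> matchedPairs(SparseCursor& cmx, SparseCursor& ccm,
                                    std::size_t& skipped) {
    std::vector<PairIndex> matched;
    skipped = 0;

    // Start both cursors on their first real pair, not on coordinate (0, 0).
    bool haveCmx = cmx.readFirst();
    bool haveCcm = ccm.readFirst();

    while (haveCmx && haveCcm) {
        if (cmx.index() == ccm.index()) {
            matched.push_back(cmx.index());
            haveCmx = cmx.readNext();
            haveCcm = ccm.readNext();
        } else if (cmx.index() < ccm.index()) {
            // cmx has a pair that ccm lacks: advance cmx to recover.
            ++skipped;
            haveCmx = cmx.readNext();
        } else {
            // ccm has a pair that cmx lacks: advance ccm to recover.
            ++skipped;
            haveCcm = ccm.readNext();
        }
    }
    return matched;
}
```

In this sketch a mismatch costs only the unmatched pair, so one missing entry does not shift every later comparison out of alignment.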

@spficklin
Member Author

@JohnHadish can you do a functional test to make sure the problem is fixed?

@bentsherman would you be able to do a quick code check to make sure nothing weird stands out to you?

@spficklin spficklin added the bug label Jul 10, 2020
@spficklin spficklin added this to the Release 3.4.2 milestone Jul 10, 2020
@spficklin spficklin mentioned this pull request Jul 10, 2020
@JohnHadish
Collaborator

JohnHadish commented Jul 10, 2020

This does not appear to work for me. After I installed the 157_cmx_ccm_sync branch, I got a much longer error message. The old version did not print any warning messages until the end, and it did not segfault.

To reproduce the error, use my test KINC repo, RUN_KINC_DEFAULT, on my personal GitLab account. I just gave @spficklin permissions for it. To run my test, clone the repo and run ./02-Run_KINC.sh

ERROR MESSAGES

BEFORE FIX ERROR

99%	2s	
warning: cmx and ccm are out of sync
100%
Removing biased edges from the extract network using KINC.R.

AFTER FIX ERROR -- last few lines (it was much longer)

warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 665 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 680 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 688 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 693 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 701 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 704 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 708 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 710 ).
warning: cmx and ccm are out of sync at cmx coordinate ( 773 , 711 ).
[jah-desktop:09966] *** Process received signal ***
[jah-desktop:09966] Signal: Segmentation fault (11)
[jah-desktop:09966] Signal code:  (128)
[jah-desktop:09966] Failing at address: (nil)
[jah-desktop:09966] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7fe9a28a0210]
[jah-desktop:09966] [ 1] kinc(+0x85694)[0x555dece19694]
[jah-desktop:09966] [ 2] kinc(+0x3907c)[0x555decdcd07c]
[jah-desktop:09966] [ 3] kinc(+0x39f21)[0x555decdcdf21]
[jah-desktop:09966] [ 4] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic15AbstractManager11writeResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EEi+0x50)[0x7fe9b3146400]
[jah-desktop:09966] [ 5] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic6Single11writeResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x16)[0x7fe9b313ec66]
[jah-desktop:09966] [ 6] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic13AbstractInput10saveResultEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x5b)[0x7fe9b31455eb]
[jah-desktop:09966] [ 7] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic9SimpleRun7addWorkEOSt10unique_ptrI22EAbstractAnalyticBlockSt14default_deleteIS3_EE+0x31)[0x7fe9b313c911]
[jah-desktop:09966] [ 8] /usr/local/lib/libacecore.so.3(_ZN3Ace8Analytic6Single7processEv+0x41)[0x7fe9b313eaf1]
[jah-desktop:09966] [ 9] /lib/x86_64-linux-gnu/libQt5Core.so.5(+0x2bf5b6)[0x7fe9a30585b6]
[jah-desktop:09966] [10] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN7QObject5eventEP6QEvent+0x1d5)[0x7fe9a304bcf5]
[jah-desktop:09966] [11] /usr/local/lib/libacecli.so.3(_ZN12EApplication6notifyEP7QObjectP6QEvent+0x25)[0x7fe9a3300c85]
[jah-desktop:09966] [12] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN16QCoreApplication15notifyInternal2EP7QObjectP6QEvent+0x18a)[0x7fe9a301f93a]
[jah-desktop:09966] [13] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN14QTimerInfoList14activateTimersEv+0x3d0)[0x7fe9a30768b0]
[jah-desktop:09966] [14] /lib/x86_64-linux-gnu/libQt5Core.so.5(+0x2de1e4)[0x7fe9a30771e4]
[jah-desktop:09966] [15] /lib/x86_64-linux-gnu/libglib-2.0.so.0(g_main_context_dispatch+0x27d)[0x7fe99de6afbd]
[jah-desktop:09966] [16] /lib/x86_64-linux-gnu/libglib-2.0.so.0(+0x52240)[0x7fe99de6b240]
[jah-desktop:09966] [17] /lib/x86_64-linux-gnu/libglib-2.0.so.0(g_main_context_iteration+0x33)[0x7fe99de6b2e3]
[jah-desktop:09966] [18] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x65)[0x7fe9a3077565]
[jah-desktop:09966] [19] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE+0x12b)[0x7fe9a301e4db]
[jah-desktop:09966] [20] /lib/x86_64-linux-gnu/libQt5Core.so.5(_ZN16QCoreApplication4execEv+0x96)[0x7fe9a3026246]
[jah-desktop:09966] [21] /usr/local/lib/libacecli.so.3(_ZN12EApplication4execEv+0x5ba)[0x7fe9a33023ca]
[jah-desktop:09966] [22] kinc(+0x261df)[0x555decdba1df]
[jah-desktop:09966] [23] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fe9a28810b3]
[jah-desktop:09966] [24] kinc(+0x26c5e)[0x555decdbac5e]
[jah-desktop:09966] *** End of error message ***
./02-Run_KINC.sh: line 100:  9966 Segmentation fault      (core dumped) kinc run extract --emx "${PREFIX}.emx" --ccm "${PREFIX}.paf.ccm" --cmx "${PREFIX}.paf.cmx" --csm "${PREFIX}.csm" --format "tidy" --output "${PREFIX}.paf-th${th}-p${p}-rsqr${r2}.txt" --mincorr $th --maxcorr 1 --filter-pvalue $p --filter-rsquare $r2

@spficklin
Member Author

@JohnHadish you will have to rerun the cond-test to fix the out-of-sync files and then run extract.

@JohnHadish
Collaborator

Yes, I confirm that this is what I did. Here are my exact steps. I started with a fresh test directory to ensure I was not using old files:

cd KINC/
git pull
git checkout origin/157_cmx_ccm_sync
git pull
git status
sudo make
sudo make install
cd ..
git clone git@gitlab.com:JohnHadish/run_kinc_default.git
cd run_kinc_default
./02-Run_KINC.sh

@spficklin
Member Author

Okay. It looks like the extract analytic was having similar issues, so I applied a fix and tested it with your dataset, @JohnHadish, and it ran just fine. Can you test again?

Collaborator

@JohnHadish JohnHadish left a comment

Works great now!

@JohnHadish JohnHadish merged commit 023f542 into develop Jul 10, 2020
@spficklin
Member Author

Thanks @JohnHadish and @bentsherman for the quick reviews!
